

1 ELECTRONIC WATERMARKING METHOD AND APPARATUS FOR COMPRESSED 



3 Field of the Invention 

4 The present invention relates to a method and a system for 

5 embedding, detecting and updating additional information, 

6 such as copyright information, relative to compressed 

7 digital audio data, and relates in particular to a technique 

8 whereby an operation equivalent to an electronic 

9 watermarking technique performed in a frequency domain can 

10 be applied for compressed audio data. 

11 Background Art 

As techniques for the electronic watermarking of audio data, there are the spread spectrum method, a method that employs a polyphase filter, and a method that transforms data into a frequency domain and embeds information in the resultant data. The method for embedding and detecting information in the frequency domain has merit in that an auditory psychological model can be easily employed, in that high tone quality can be easily provided, and in that the resistance to transformation and noise is high. However, the target of the conventional audio electronic watermarking technique is limited to digital audio data that is not compressed. For the Internet distribution of audio data, the audio data are generally compressed because of the limitation imposed by the communication capacity, and the compressed data are transmitted to users. Thus, when the conventional electronic watermarking technique is employed, it is necessary for the compressed audio data to be decompressed, for the additional information to be embedded in the obtained data, and for the resultant data to be compressed again. The calculation time required for this series of operations is extended for advanced audio compression techniques that implement both high tone quality and high compression efficiency. How long it takes before a user can listen to audio data greatly affects the purchase intent of the user. Therefore, there is a demand for a process whereby the embedding, changing or updating of additional information can be performed while the audio data are compressed. However, there is presently no known method available for embedding additional information directly into compressed digital audio data, or for changing or detecting the additional information.

19 SUMMARY OF THE INVENTION 

20 To resolve the above shortcoming, it is one object of the 

21 present invention to provide a method and a system with 

22 which information embedded in compressed digital audio data 

23 can be directly operated. 

It is one more object of the present invention to provide a method and a system with which additional information can be embedded in compressed digital audio data.

2 It is another object of the present invention to provide a 

3 method and a system for which only a small memory capacity 

4 is required in order to embed additional information in 

5 digital audio data. 

6 It is an additional object of the present invention to 

7 provide a method and a system with which minimized 

8 additional information can be embedded in digital audio 

9 data. 

10 It is a further object of the present invention to provide a 

11 method and a system with which additional information 

12 embedded in compressed digital audio data can be detected 

13 without the decompression of the audio data being required. 

14 It is yet one more object of the present invention to 

15 provide a method and a system with which additional 

16 information embedded in compressed digital audio data can be 

17 changed without the decompression of the audio data being 

18 required. 

19 BRIEF DESCRIPTION OF THE DRAWINGS: 

These and other aspects, features, and advantages of the present invention will become apparent upon further consideration of the following detailed description of the invention when read in conjunction with the following drawings.

3 Fig. 1 is a block diagram illustrating an apparatus for 

4 embedding additional information directly in compressed 

5 audio data. 

6 Fig. 2 is a diagram showing an example for a window length 

7 and a window function. 

8 Fig. 3 is a diagram showing the relationship existing 

9 between a window function and MDCT coefficients. 

10 Fig. 4 is a block diagram of an MDCT domain that corresponds 

11 to a frame along a time axis. 

12 Fig. 5 is a specific diagram showing a sine wave. 

13 Fig. 6 is a diagram showing an example for embedding 

14 additional information in an adjacent frame. 

15 Fig. 7 is a diagram showing a portion of a basis for which 

16 the MDCT has been performed. 

17 Fig. 8 is a diagram showing an example of the separation of 

18 a basis. 

19 Fig. 9 is a block diagram showing an additional information 

20 embedding system according to the present invention. 



DOCKET NUMBER: JA919990075US1 






1 Fig. 10 is a block diagram showing an additional information 

2 detection system according to the present invention. 

3 Fig. 11 is a block diagram showing an additional information 

4 updating system according to the present invention. 

5 Fig. 12 is a diagram showing the general hardware 

6 arrangement of a computer. 

Description of the Symbols

5: Keyboard/mouse controller
6: Keyboard
7: Pointing device
8: Display adaptor card
9: Video memory
10: DAC/LCDC
Display device
CRT display
Hard disk drive
ROM
Serial port
Parallel port
Timer
Communication adaptor
Floppy disk controller
20: Floppy disk drive
21: Audio controller
22: Amplifier
23: Loudspeaker
24: Microphone
25: IDE controller
26: CD-ROM
27: SCSI controller
28: MO
29: CD-ROM
30: Hard disk drive
31: DVD
32: DVD
100: System



15 DETAILED DESCRIPTION OF THE INVENTION: 

16 Additional information embedding system 

To achieve the above objects, according to the present invention, a system for embedding additional information in compressed audio data comprises:

(1) means for extracting MDCT (Modified Discrete Cosine Transform) coefficients from the compressed audio data;

(2) means for employing the MDCT coefficients to calculate a frequency component for the compressed audio data;

(3) means for embedding additional information in the frequency component obtained in a frequency domain;

(4) means for transforming into MDCT coefficients the frequency component in which the additional information is embedded; and

(5) means for using the MDCT coefficients, in which the additional information is embedded, to generate compressed audio data.

9 Additional information updating system 

Further, according to the present invention, a system for updating additional information embedded in compressed audio data comprises:

(1) means for extracting MDCT coefficients from the compressed audio data;

(2) means for employing the MDCT coefficients to calculate a frequency component for the compressed audio data;

(3) means for detecting the additional information in the frequency component that is obtained;

(3-1) means for changing, as needed, the additional information for the frequency component;

(4) means for transforming into MDCT coefficients the frequency component in which the additional information is embedded; and

(5) means for using the MDCT coefficients, in which the additional information is embedded, to generate compressed audio data.

7 Additional information detection system 

8 Further, according to the present invention, a system for 

9 detecting additional information embedded in compressed 

10 audio data comprises: 

11 (1) means for extracting MDCT coefficients from the 

12 compressed audio data; 

13 (2) means for employing the MDCT coefficients to calculate a 

14 frequency component for the compressed audio data; and 

15 (3) means for detecting the additional information in the 

16 frequency component that is obtained. 

17 It is preferable that the means (2) calculate the frequency 

18 component for the compressed audio data using a precomputed 

19 table in which a correlation between MDCT coefficients and 

20 frequency components is included. 

It is also preferable that the means (4) transform the frequency component into the MDCT coefficients by using a precomputed table that includes a correlation between MDCT coefficients and frequency components.

In addition, it is preferable that the means (3) for embedding the additional information in the frequency domain divide an area for embedding one bit in the time domain, calculate a signal level for each of the individual area segments obtained, and embed the additional information in the frequency domain in accordance with the lowest signal level available for each frequency.

11 Correlation table generation method 

According to the present invention, for at least one window function and one window length employed for compressing audio data, a method for generating a table including a correlation between MDCT coefficients and frequency components comprises:

(1) a step of generating a basis which is used for performing a Fourier transform for a waveform along a time axis;

(2) a step of multiplying a window function by a corresponding waveform that is generated by using the basis;

(3) a step of performing an MDCT process on the result obtained by the multiplication of the window function, and of calculating MDCT coefficients; and

(4) a step of correlating the basis and the MDCT coefficients.

For example, the basis can be a sine wave and a cosine wave.

5 Operation of additional information embedding system 

The system for embedding additional information in compressed audio data first extracts the compressed MDCT coefficients from the compressed digital audio data. Then, the system employs MDCT coefficient sequences that have been calculated and stored in a table in advance to obtain the frequency components of the audio data. Thereafter, the system employs the method for embedding additional information in a frequency domain to calculate an embedded frequency signal. Subsequently, the system employs the table to transform the embedded frequency signal into MDCT coefficients, and adds the obtained MDCT coefficients to the MDCT coefficients of the audio data. The resultant MDCT coefficients are defined as new MDCT coefficients for the audio data and are again compressed, the resultant data being regarded as watermarked digital audio data.

According to the method of the invention for embedding the minimum data, a frame in which one bit is to be embedded is divided in the time domain, a signal level is calculated for each of the frame segments, and the upper embedding limit is obtained in accordance with the lowest signal level available for each frequency.
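
The limit calculation described above may be sketched as follows. This is an illustration only, not the implementation of the invention: the function name embedding_limits, the layout of segment_levels, and the scaling parameter ratio are hypothetical and do not appear in this description.

```python
def embedding_limits(segment_levels, ratio=0.1):
    """Upper embedding limit per frequency for a one-bit frame.

    segment_levels[s][f] is the signal level of frequency f in time
    segment s of the frame. The limit for each frequency follows the
    lowest level found over all segments; `ratio` is an assumed scale.
    """
    n_freq = len(segment_levels[0])
    return [
        ratio * min(segment[f] for segment in segment_levels)
        for f in range(n_freq)
    ]


# Two time segments, two frequencies.
limits = embedding_limits([[1.0, 4.0], [2.0, 3.0]], ratio=0.5)
print(limits)  # [0.5, 1.5]
```

Taking the minimum over the segments keeps the embedded signal below the quietest portion of the frame at each frequency.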

2 Operation performed for correlation table 

A table for correlating the MDCT coefficients and the frequency components is obtained by calculating in advance the representation of each basis of a Fourier transform in terms of the MDCT coefficients, in accordance with a frame length (a window function and a window length). Thus, an operation on the compressed audio data can be performed directly.

The means for reducing the memory size that is required for the correlation table employs the periodicity of the basis, such as a sine wave or a cosine wave, to prevent the storage of redundant information. Alternatively, instead of storing in the table the MDCT results obtained for the individual bases of the Fourier transform, each basis is divided into several segments, and the corresponding MDCT coefficients are stored, so that the memory size required for the table can be reduced.

19 Operation of additional information detection system 

The system of the invention employed for detecting additional information in compressed audio data recovers the coded MDCT coefficients and employs the same table as is used for the embedding system to perform a process equivalent to the detection in the frequency domain and the detection of bit information and a code signal.

2 Operation of additional information updating system 

3 The system of the invention, used for updating additional 

4 information embedded in compressed audio data, recovers the 

5 coded MDCT coefficients and employs the same method as the 

6 detection system to detect a signal embedded in the MDCT 

7 coefficients. Only when the strength of the embedded signal 

8 is insufficient, or when a signal that differs from a signal 

9 to be embedded is detected and updating is required, the 

10 same method is employed as that used by the embedding system 

11 to embed additional information in the MDCT coefficients. 

12 The newly obtained MDCT coefficients are thereafter recorded 

13 so that they can be employed as updated digital audio data. 
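
The update decision described above may be sketched as follows. The function name needs_update and the parameter min_strength are hypothetical; this description does not specify how the strength of the embedded signal is measured or what threshold is used.

```python
def needs_update(detected_bits, detected_strength, target_bits, min_strength):
    """Return True when the additional information should be re-embedded.

    Updating is required when the detected signal is too weak, or when
    the detected bits differ from the bits that are to be embedded.
    """
    too_weak = detected_strength < min_strength
    differs = detected_bits != target_bits
    return too_weak or differs


print(needs_update([1, 0, 1], 0.9, [1, 0, 1], 0.5))  # False: strong and identical
print(needs_update([1, 0, 1], 0.2, [1, 0, 1], 0.5))  # True: signal too weak
print(needs_update([1, 0, 0], 0.9, [1, 0, 1], 0.5))  # True: different signal
```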

14 Preferred Embodiment 

15 First, definitions of terms will be given before the 

16 preferred embodiment of the invention is explained. 

17 Sound compression technique 

Compressed data for the present invention are electronic compressed data for common sounds, such as voices, music and sound effects. Well-known sound compression techniques include MPEG1 and MPEG2. In the specification, such a compression technique is generally called the sound compression technique, and the common sounds are described as sound or audio.

2 * Compressed state 

3 The compressed state is the state wherein the amount of 

4 audio data is reduced by the target sound compression 

5 technique, while deterioration of the sound is minimized. 

6 * Non-compressed state 

7 The non-compressed state is a state wherein an audio 

8 waveform, such as a WAVE file or an AIFF file, is described 

9 without being processed. 

10 * Decode the compressed state 

11 This means "convert from the compressed state of the audio 

12 data to the non-compressed state." This definition is also 

13 applied to "shifting to the non-compressed state." 

14 * MDCT transform (Modified Discrete Cosine Transform) 

Equation 1

[All the equations are tabulated at the end of the text of this description, just before the claims.]

Xn denotes a sample value along the time axis, and n is an index along the time axis.

Mk denotes an MDCT coefficient, and k is an integer of from 0 to (N/2)-1, and denotes an index indicating a frequency.

In the MDCT transform, the sequence X0 to X(N-1) along the time axis is transformed into the sequence M0 to M((N/2)-1) along the frequency axis. While the MDCT coefficient represents one type of frequency component, in this specification, the "frequency component" means a coefficient that is obtained as a result of the DFT transform.
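
As an illustration only, the MDCT described above may be sketched in Python as follows. Because Equation 1 is tabulated elsewhere in this description, the sketch assumes the commonly used unscaled form Mk = sum over n of Xn cos((2π/N)(n + 1/2 + N/4)(k + 1/2)); the function name mdct is hypothetical, and any scaling factor in Equation 1 is not reproduced.

```python
import math


def mdct(x):
    """Transform N time samples into N/2 MDCT coefficients (N even).

    Assumed form: M_k = sum_n x_n * cos(2*pi/N * (n + 1/2 + N/4) * (k + 1/2)).
    """
    n_samples = len(x)
    half = n_samples // 2
    shift = 0.5 + half / 2  # phase offset 1/2 + N/4
    return [
        sum(
            x[n] * math.cos(2 * math.pi / n_samples * (n + shift) * (k + 0.5))
            for n in range(n_samples)
        )
        for k in range(half)
    ]


coefficients = mdct([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
print(len(coefficients))  # 4 coefficients for N = 8 samples
```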

9 * DFT transform (Discrete Fourier Transform) 

10 Equation 2 

Xn denotes a sample value along the time axis, and n denotes an index along the time axis.

Rk denotes a real number component (cosine wave component); Ik denotes an imaginary number component (sine wave component); and k is an integer of from 0 to (N/2)-1, and denotes an index indicating a frequency. The discrete Fourier transform is a transformation of the sequence X0 to X(N-1) along the time axis into the sequences R0 to R((N/2)-1) and I0 to I((N/2)-1) along the frequency axis. In this specification, "frequency component" is the general term for the sequences Rk and Ik.
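
The sequences Rk and Ik may be computed as in the following sketch. Equation 2 is tabulated elsewhere in this description, so the sign of the sine term and the absence of a normalization factor are assumptions, and the function name dft_components is hypothetical.

```python
import math


def dft_components(x):
    """Return (R, I): cosine and sine components for k = 0 .. N/2 - 1."""
    n_samples = len(x)
    real, imag = [], []
    for k in range(n_samples // 2):
        angle = 2 * math.pi * k / n_samples
        real.append(sum(x[n] * math.cos(angle * n) for n in range(n_samples)))
        imag.append(sum(x[n] * math.sin(angle * n) for n in range(n_samples)))
    return real, imag


# One cycle of a cosine wave over 8 samples appears only in R1.
samples = [math.cos(2 * math.pi * n / 8) for n in range(8)]
R, I = dft_components(samples)
print(round(R[1], 6), round(abs(I[1]), 6))  # 4.0 0.0
```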

* Window function

This function is to be multiplied by the sample values before the MDCT is performed. Generally, the sine function or the Kaiser function is employed.

* Window length

The window length is a value that represents the shape or length of a window function to be multiplied with data in accordance with the characteristic of the audio data, and that indicates the number of samples for which the MDCT should be performed.

Fig. 1 is a block diagram showing the processing performed by an apparatus for directly embedding additional information in compressed audio data. A block 110 is a block for extracting MDCT coefficient sequences from compressed audio data that are entered. A block 120 is a block for employing the extracted MDCT coefficients to calculate the frequency component of the audio data. A block 130 is a block for embedding additional information in the obtained frequency component in a frequency domain. A block 140 is a block for transforming into MDCT coefficients the frequency component in which the additional information is embedded. And finally, a block 150 is a block for generating compressed audio data by using the MDCT coefficients obtained by the block 140.

The blocks 120 and 130 employ a correlation table for the MDCT coefficients and the frequency components to perform a fast transform. In this invention, the representations of the bases of the Fourier transform in the MDCT domain are entered in advance in the table, and are employed for the individual embedding, detection and updating systems. An explanation will now be given for the correlation table for the MDCT coefficients and the frequency components and the generation method therefor, the systems used for embedding, detecting and updating additional information in compressed audio data, and other associated methods.

11 Correlation table for MDCT coefficients and frequency 

12 components 

Audio data must be transformed into a frequency domain in order to employ an auditory psychological model for the embedding calculation. However, a very long calculation time is required to perform the inverse transform for audio data that are represented as MDCT coefficients and then to perform the Fourier transform for the audio data in the time domain. Thus, a correlation between the MDCT coefficients and the frequency components is required.

If the audio data are compressed by performing the MDCT for a constant number of samples without a window function, the MDCT employs the cosine wave with a shifted phase as a basis. Therefore, the difference from a Fourier transform consists only of the shifting of a phase, and a preferable correlation can be expected between the MDCT domain and the frequency domain. However, to obtain improved tone quality, the latest compression techniques change the shape or the length of the window function to be multiplied (hereinafter referred to as a window length) in accordance with the characteristic of the audio data. Thus, a simple correlation between a specific frequency for the MDCT and a specific frequency for a Fourier transform cannot be obtained, and since the correlation cannot be acquired through calculation, it must be stored in a table.

Fig. 2 is a diagram showing window length and window function examples. While this invention can be applied for various compressed data standards, in this embodiment, the MPEG2 standards are employed. For MPEG2 AAC (Advanced Audio Coding), for example, a window function normally having a window length of 2048 samples is multiplied to perform the MDCT. For a portion where sound is drastically altered, a window function having a window length of 256 samples is multiplied to perform the MDCT, so that a type of deterioration called pre-echo is prevented. A normal frame for which 2048 samples is a unit is called an ONLY_LONG_SEQUENCE, and is written using 1024 MDCT coefficients that are obtained from one MDCT process. A frame for which 256 samples is a unit is called an EIGHT_SHORT_SEQUENCE, and is written using eight sets of 128 MDCT coefficients that are obtained by repeating the MDCT eight times, for 256 samples each time, with each frame half overlapping its adjacent frame. Further, asymmetric window functions called a LONG_START_SEQUENCE and a LONG_STOP_SEQUENCE are also employed to connect the above frames.

3 Fig. 3 is a diagram showing the correlation between the 

4 window functions and the MDCT coefficients sequence. For 

5 the MPEG2 AAC, the window functions are multiplied by the 

6 audio data along the time axis, for example, in the order 

7 indicated by the curves in Fig. 3, and the MDCT coefficients 

8 are written in the order indicated by the thick arrows. 

9 When the window length is varied, as in this example, the 

10 bases of a Fourier transform can not simply be transformed 

11 into a number of MDCT coefficients. 

12 Therefore, to embed additional information, the correlation 

13 table of this invention does not depend on the window 

14 function (a signal added during the additional information 

15 embedding process should not depend on a window function 

16 when the signal is decompressed and developed along the time 

17 axis) . Therefore, when an embedding method is employed that 

18 depends on the shape of the window function and the window 

19 length, the embedding and the detection of the compressed 

20 audio data can be performed, and the window function that is 

21 used can be identified when the data are decompressed. 

The correlation table of the invention is generated so that frames in which additional information is to be embedded do not interfere with each other. That is, in order to embed additional information, the MDCT window must be employed as a unit, and when the data are developed along the time axis, one bit must be embedded in a specific number of samples, which together constitute one frame. Since for the MDCT the target frames for the multiplication of a window overlap each other by 50%, a window that extends over a plurality of frames is always present (block 3 in Fig. 4 corresponds to such a window). When additional information is simply embedded in one of these frames, it affects the other frames. And when data embedding is not performed, the data embedding intensity is reduced, as is detection efficiency. Signals indicating different types of additional information are embedded in the first and the second halves of a frame.

12 The correlation table is employed when a frequency component 

13 is to be calculated using the MDCT coefficient to embed 

14 additional information, when an embedded signal obtained at 

15 the frequency domain is to be again transformed into an MDCT 

16 coefficient, and when a calculation corresponding to a 

17 detection in a frequency domain is to be performed in the 

18 MDCT domain. Since the detection and the embedding of a 

19 signal are performed in order during the updating process, 

20 all the transforms described above are employed in the 

21 updating process. 

22 Method for generating a correlation table when the length of 

23 a window function is unchanged 

First, an explanation will be given for the table generation method when a window length is constant, and for the detection and embedding methods that use the table. These methods will be extended later for use with a plurality of window lengths. Assume that the window function is multiplied along the time axis by audio data consisting of N samples and the MDCT is performed to obtain N/2 MDCT coefficients, and that the N/2 MDCT coefficients are written as one block (i.e., a constant window length is defined as N samples). Hereinafter, if not specifically noted, the term "block" represents N/2 MDCT coefficients. The audio data along the time axis that correspond to two sequential blocks overlap by 50%, i.e., by N/2 samples.

The target of the present invention is limited to an embedding ratio at which one bit is embedded in a number of samples that is an integer multiple of N/2. In this embodiment, the number of samples required along the time axis to embed one bit is defined as n×N/2, which is called one frame. Due to the previously mentioned 50% overlap property, there is also a block that extends across two sequential frames along the time axis. Fig. 4 is a specific diagram showing, for n=2, two frames extended along the time axis that correspond to five blocks in the MDCT domain. The audio data along the time axis are shown in the lower portion in Fig. 4, the MDCT coefficient sequences are shown in the upper portion, and elliptical arcs represent the MDCT targets. Block 3 is a block extending half way across Frame 1 and Frame 2.

Since the embedding operation is performed for independent frames, only the correlation between the frequency component and the MDCT coefficient for each frame is required for the table. In other words, adjacent frames in which embedding is performed should not affect each other. Therefore, for each basis of a Fourier transform having a cycle of N/(2×m), the MDCT coefficient sequences obtained using the following methods are employed to prepare a table. In this case, m is an integer equal to or smaller than N/2. Fig. 5 is a diagram showing a sine wave for n=2 and m=1.

10 There are n+1 blocks present that are associated with one 

11 frame, and the first and the last blocks also extend into 

12 the respective succeeding and preceding frames (blocks 1 and 

13 3 in Fig. 5) . Thus, assume a waveform (the thick line 

14 portion in Fig. 5) is obtained by connecting N/2 samples 

15 having a value of 0 before and after the basis waveform that 

16 has an amplitude of 1.0 and a length equivalent to one 

17 frame. When a window function (corresponding to an 

18 elliptical arc in Fig. 5) is multiplied by N samples, while 

19 50% of the first part of the waveform is overlapped, and the 

20 MDCT is performed, this waveform can be represented by using 

21 the MDCT coefficients. If the IMDCT is performed for the 

22 obtained MDCT coefficients sequence, the preceding and 

23 succeeding N/2 samples have a value of 0. 

Fig. 6 is a diagram showing an example wherein additional information is embedded in adjacent frames. When samples having a value of 0 are added as shown in Fig. 6, the interference produced by embedding performed in adjacent frames can be prevented. In the data detection process and the frequency component calculation process, detection results and frequency components can be obtained that are designated for a pertinent frame and that are not affected by preceding and succeeding frames. If the samples having a value of 0 are not added, adjacent frames affect each other in the embedding and detection processes.

The processing performed to prepare the table is as follows.

Step 1: First, calculations are performed for a cosine wave having a cycle of (N/2)×n/k, an amplitude of 1.0 and a length of (N/2)×n. This cosine wave corresponds to the k-th basis when a Fourier transform is to be performed for the (N/2)×n samples.

$$f(x) = \cos\left(\frac{2\pi}{(N/2)\times n/k}\,x\right) = \cos\left(\frac{4k\pi}{N\times n}\,x\right) \qquad (0 \le x < (N/2)\times n)$$

Step 2: N/2 samples having a value of 0 are added at the beginning and at the end of the waveform (Fig. 5).

$$g(y) = \begin{cases} 0 & (0 \le y < N/2) \\ f(y - N/2) & (N/2 \le y < (N/2)\times(n+1)) \\ 0 & ((N/2)\times(n+1) \le y < (N/2)\times(n+2)) \end{cases}$$

Step 3: The samples (N/2)×(b-1)th to (N/2)×(b+1)th are extracted. Here b is an integer of from 1 to n+1, and for all of these integers the following process is performed.

$$h_b(z) = g(z + (N/2)\times(b-1)) \qquad (0 \le z < N)$$

Step 4: The results are multiplied by a window function.

$$h_b(z) = h_b(z)\times \mathrm{win}(z) \qquad (0 \le z < N,\ \mathrm{win}(z)\ \text{is a window function})$$

Step 5: The MDCT process is performed, and the obtained N/2 MDCT coefficients are defined as vectors $V_{r,b,k}$.

$$V_{r,b,k} = \mathrm{MDCT}(h_b(z))$$

Since the MDCT transform is an orthogonal transform and the bases of a Fourier transform are linearly independent, the vectors $V_{r,b,k}$ are orthogonal for k having a value of 1 to N/2.

Step 6: $V_{r,b,k}$ is obtained for all the combinations (k, b), and each matrix $T_{r,b}$ is formed.

$$T_{r,b} = (V_{r,b,1},\ V_{r,b,2},\ V_{r,b,3},\ \ldots,\ V_{r,b,N/2})$$

The vector that is obtained for a sine wave using the same method is defined as $V_{i,b,k}$, and the matrix is defined as $T_{i,b}$. Each sequence is an MDCT coefficient sequence that represents a sine wave having an amplitude of 1.0. Since there are blocks 1 to n+1, 2×(n+1) matrixes are obtained.
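
Steps 1 to 6 may be sketched as follows for a small window length. This is a sketch under stated assumptions rather than the implementation of the invention: the MDCT uses the common unscaled form (Equation 1 is tabulated elsewhere), the window is assumed to be a sine window, and the names mdct, sine_window, basis_vectors and build_table are hypothetical.

```python
import math


def mdct(x):
    """MDCT of N samples into N/2 coefficients (assumed unscaled form)."""
    n_samples = len(x)
    half = n_samples // 2
    shift = 0.5 + half / 2
    return [
        sum(
            x[n] * math.cos(2 * math.pi / n_samples * (n + shift) * (k + 0.5))
            for n in range(n_samples)
        )
        for k in range(half)
    ]


def sine_window(length):
    """Sine window, assumed here as the window function win(z)."""
    return [math.sin(math.pi * (z + 0.5) / length) for z in range(length)]


def basis_vectors(N, n, k, wave=math.cos):
    """Return the vectors V_{b,k} for b = 1 .. n+1 and the k-th basis."""
    half = N // 2
    frame_len = half * n
    # Step 1: basis waveform with a cycle of (N/2)*n/k and amplitude 1.0.
    f = [wave(2 * math.pi * k * x / frame_len) for x in range(frame_len)]
    # Step 2: add N/2 zero-valued samples before and after the waveform.
    g = [0.0] * half + f + [0.0] * half
    win = sine_window(N)
    vectors = []
    for b in range(1, n + 2):
        start = half * (b - 1)
        # Steps 3 and 4: extract the N samples of block b, apply the window.
        h = [g[start + z] * win[z] for z in range(N)]
        # Step 5: the MDCT of the windowed block is the vector V_{b,k}.
        vectors.append(mdct(h))
    return vectors


def build_table(N, n, wave=math.cos):
    """Step 6: table[b-1][k-1] holds V_{b,k}; one matrix per block b."""
    half = N // 2
    per_k = [basis_vectors(N, n, k, wave) for k in range(1, half + 1)]
    return [[per_k[k][b] for k in range(half)] for b in range(n + 1)]


T_r = build_table(8, 2, math.cos)  # cosine bases give T_{r,b}
T_i = build_table(8, 2, math.sin)  # sine bases give T_{i,b}
print(len(T_r) + len(T_i))  # 2 x (n+1) = 6 matrices
```

For practical window lengths such as N=2048 the same steps apply, although a numerical library would normally replace the pure-Python loops.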

Transform from a frequency domain into an MDCT domain

Assume that the audio data in the frequency domain are represented as $R + jI$, where $j$ denotes the imaginary unit, $R$ is the N/2-th order real number vector that represents the real number elements, and $I$ is the N/2-th order real number vector that represents the imaginary number elements. The k-th element corresponds to a basis having a cycle of $(N/2) \times n/k$ samples. The MDCT coefficient sequence $M_b$ is obtained as the sum of the vectors of the MDCT coefficient sequences that are obtained by transforming each frequency component separately into the MDCT domain, and can be represented as $M_b = T_{r,b} R + T_{i,b} I$. In this case, b is an integer from 1 to n+1, and corresponds to each block. $M_1$ and $M_{n+1}$ are the MDCT coefficient sequences for blocks that extend across portions of adjacent frames.
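
The table generation steps and the relation $M_b = T_{r,b} R + T_{i,b} I$ can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the direct MDCT formula, the sine window, the function names and the small sizes N=8, n=4 are all assumptions chosen only so that the example runs quickly.

```python
import math

def mdct(block):
    """Direct O(N^2) MDCT: N input samples -> N/2 coefficients."""
    N = len(block)
    return [
        sum(block[t] * math.cos(2 * math.pi / N * (t + 0.5 + N / 4) * (j + 0.5))
            for t in range(N))
        for j in range(N // 2)
    ]

def sine_window(N):
    """Sine window (an assumption; the text leaves win(z) open)."""
    return [math.sin(math.pi * (z + 0.5) / N) for z in range(N)]

def padded(frame, N):
    """g(y): the frame with N/2 zero samples before and after it."""
    return [0.0] * (N // 2) + list(frame) + [0.0] * (N // 2)

def block_mdct(g, N, b):
    """Steps 3 to 5: extract block b (1-based), apply the window, take the MDCT."""
    start = (N // 2) * (b - 1)
    win = sine_window(N)
    return mdct([g[start + z] * win[z] for z in range(N)])

def basis(N, n, k, wave):
    """One Fourier basis over the frame of N/2 * n samples (wave: math.cos or math.sin)."""
    return [wave(4 * k * math.pi / (N * n) * x) for x in range((N // 2) * n)]

def build_tables(N, n, b):
    """T_{r,b} and T_{i,b}, each stored as a list of N/2 column vectors (k = 1..N/2)."""
    ks = range(1, N // 2 + 1)
    T_r = [block_mdct(padded(basis(N, n, k, math.cos), N), N, b) for k in ks]
    T_i = [block_mdct(padded(basis(N, n, k, math.sin), N), N, b) for k in ks]
    return T_r, T_i

def to_mdct(T_r, T_i, R, I):
    """M_b = T_{r,b} R + T_{i,b} I, accumulated column by column."""
    M = [0.0] * len(T_r[0])
    for k in range(len(R)):
        for j in range(len(M)):
            M[j] += T_r[k][j] * R[k] + T_i[k][j] * I[k]
    return M

# Usage: frequency components -> MDCT coefficients of block b via the table.
N, n, b = 8, 4, 2
R = [1.0, 0.0, 0.5, 0.0]
I = [0.0, 2.0, 0.0, -1.0]
T_r, T_i = build_tables(N, n, b)
M_b = to_mdct(T_r, T_i, R, I)

# Cross-check: synthesize the waveform in the time domain and transform it directly.
cos_b = [basis(N, n, k, math.cos) for k in range(1, N // 2 + 1)]
sin_b = [basis(N, n, k, math.sin) for k in range(1, N // 2 + 1)]
frame = [sum(R[k] * cos_b[k][x] + I[k] * sin_b[k][x] for k in range(N // 2))
         for x in range((N // 2) * n)]
direct = block_mdct(padded(frame, N), N, b)
```

Because the MDCT is linear, `M_b` computed from the table agrees with `direct` up to rounding error, which is the property the table relies on.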

Transform from an MDCT domain into a frequency domain

Here, $V_{i,b,k}$ and $V_{r,b,k}$ are orthogonal to each other and form an MDCT domain. Thus, when a specific MDCT coefficient sequence is given, and the inner product of that MDCT coefficient sequence and $V_{r,b,k}$ or $V_{i,b,k}$ is calculated, the element of $M_b$ in the corresponding direction is obtained, which represents a real number element or an imaginary number element, respectively, in the frequency domain. The MDCT coefficient sequences for the (n+1) blocks associated with one frame are processed collectively to obtain the frequency components for the pertinent frame.

Equation 3
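
The inner-product recovery described above can be sketched as a round trip. Equation 3 is not reproduced in this text, so the normalization used here (dividing the summed inner products by the summed squared norms of the table columns) is an assumption, as are the sine window, the helper names and the sizes N=8, n=4.

```python
import math

def mdct(block):
    """Direct O(N^2) MDCT: N input samples -> N/2 coefficients."""
    N = len(block)
    return [
        sum(block[t] * math.cos(2 * math.pi / N * (t + 0.5 + N / 4) * (j + 0.5))
            for t in range(N))
        for j in range(N // 2)
    ]

def block_columns(N, n, b, wave):
    """Table columns for block b (1-based), k = 1..N/2, using a sine window."""
    half = N // 2
    cols = []
    for k in range(1, half + 1):
        f = [wave(4 * k * math.pi / (N * n) * x) for x in range(half * n)]
        g = [0.0] * half + f + [0.0] * half          # zero padding on both sides
        start = half * (b - 1)
        h = [g[start + z] * math.sin(math.pi * (z + 0.5) / N) for z in range(N)]
        cols.append(mdct(h))
    return cols

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def to_frequency(M, tables):
    """Project the MDCT sequences M[b] of all n+1 blocks onto the table columns.

    tables[b][k] is the column for block index b and frequency index k.
    """
    out = []
    for k in range(len(tables[0])):
        num = sum(dot(M[b], tables[b][k]) for b in range(len(M)))
        den = sum(dot(tables[b][k], tables[b][k]) for b in range(len(M)))
        out.append(num / den)
    return out

N, n = 8, 4
T_r = [block_columns(N, n, b, math.cos) for b in range(1, n + 2)]
T_i = [block_columns(N, n, b, math.sin) for b in range(1, n + 2)]

R = [1.0, 0.0, 0.5, 0.0]
I = [0.0, 2.0, 0.0, -1.0]

# Forward transform for every block, then recovery by inner products.
M = [[sum(T_r[b][k][j] * R[k] + T_i[b][k][j] * I[k] for k in range(N // 2))
      for j in range(N // 2)]
     for b in range(n + 1)]
R_back = to_frequency(M, T_r)
I_back = to_frequency(M, T_i)
```

In this sketch all n+1 blocks of the frame are used together, in line with the statement that the blocks associated with one frame are processed collectively.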

Correlation table generation method when a window function is changed in audio data

Assume that the types of window functions that could be employed for compression are listed. All the window lengths are divisors of the maximum window length N. For a block having a window length of N/W samples (W is an integer), assume that the MDCT is repeated W times for N/W samples each, with 50% overlapping, and that as a result W sets of N/(2W) MDCT coefficients, i.e., a total of N/2 coefficients, are written in the block. Further, assume that in the first MDCT process the N/W samples beginning with the "offset" sample in the block are transformed. For example, for the EIGHT_SHORT_SEQUENCE of MPEG-2 AAC, N=2048, W=8 and offset=448. As a result of repeating the MDCT process eight times for 256 samples each, with 50% overlapping, eight sets of 128 MDCT coefficients are written along the time axis (see Figs. 2 and 3).
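
As a small worked example of this layout, the start position of each short window inside the block can be computed from N, W and the offset. The function name is hypothetical, and the windows are indexed from 0 here so that the first one begins at the "offset" sample, as stated in the text.

```python
def short_window_starts(N, W, offset):
    """Start sample of each of the W windows of length N/W with 50% overlap."""
    hop = N // (2 * W)          # half a short window
    return [offset + hop * w for w in range(W)]

# EIGHT_SHORT_SEQUENCE values quoted in the text
N, W, offset = 2048, 8, 448
starts = short_window_starts(N, W, offset)
window_length = N // W                      # 256 samples
coefficients_per_window = N // (2 * W)      # 128 MDCT coefficients
total_coefficients = W * coefficients_per_window
```

With these values the last window ends at sample 1600, which mirrors the 448-sample offset at the start of the block, and the eight windows together yield N/2 = 1024 coefficients.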

Table generation method

The table for the window length N/W is generated as follows.

Step 1: The same as when the length of the window function is unchanged.

Step 2: The same as when the length of the window function is unchanged.

Step 3: The N/W samples corresponding to the w-th window are extracted. Here w is an integer from 1 to W, and b is an integer from 1 to n+1. The following processing must be performed for all the combinations of b and w.

$h_{b,w}(z) = g(z + N/2 \times (b-1) + N/2/W \times w + \mathrm{offset}) \quad (0 \le z < N/W)$

Step 4: The results are multiplied by a window function.

$h_{b,w}(z) = h_{b,w}(z) \times \mathrm{win}(z) \quad (0 \le z < N/W)$, where win(z) is a window function.

Step 5: The MDCT process is performed, and the obtained N/(2W) MDCT coefficients are defined as vectors $V_{r,b,k,w}$.

$V_{r,b,k,w} = \mathrm{MDCT}(h_{b,w}(z))$

Step 6: The $V_{r,b,k,w}$ are arranged to define $V_{r,b,k}$. When $V_{r,b,k,w}$ has been obtained for all w having a value of 1 to W, they are arranged vertically to obtain the vector $V_{r,b,k}$.



DOCKET NUMBER: JA919990075US1 






Fig. 7 is a diagram showing the portion of a basis for which, with n=2, b=2, k=1 and W=8, the MDCT process has been performed to obtain the coefficients $V_{r,2,1,w}$.

Step 7: The vectors $V_{r,b,k}$ are obtained for all the combinations (k, b), and the vectors $V_{r,b,k}$ for k having values of 1 to N/2 are arranged horizontally to constitute $T_{W,r,b}$.

Since each $V_{r,b,k,w}$ is a vector of N/(2W) rows by one column, and W of them are stacked to form each $V_{r,b,k}$, this matrix is a square matrix of N/2 rows by N/2 columns. Each column illustrates how a cosine wave having a value of 1 is represented as the MDCT coefficient sequence in the b-th block having a window length of N/W. Similarly, the matrix $T_{W,i,b}$ is obtained for the sine wave. Since block numbers b from 1 to n+1 are provided, $2 \times (n+1)$ matrixes are obtained for this window length. In addition, a table is prepared for each window length and each type of window function.

Transform from the frequency domain to the MDCT domain

The difference from the case where only one type of window length is employed is that block information is read from the compressed audio data and that a different matrix is employed in accordance with the window function that is used for each block. Since the matrix is varied for each block, the MDCT coefficient sequence $M_b$ is adjusted to match the window function and the window length that are employed. The waveform that is obtained in the time domain when the IMDCT is performed for the MDCT coefficient sequence $M_b$, and the frequency components that are obtained in the frequency domain by performing a Fourier transform, do not depend on the window function and the window length. The MDCT coefficient sequence $M_b$ is obtained using $M_b = T_{W,r,b} R + T_{W,i,b} I$.

Transform from the MDCT domain to the frequency domain

When $T_{W,r,b}$ is employed instead of $T_{r,b}$, the transform into the frequency domain can be performed in the same manner. When the matrix is changed in accordance with the window function and the window length, a true frequency component can be obtained that does not depend on the window function and the window length.

Equation 4

Method for reducing a memory capacity required for the table

Since each matrix has a size of $(N/2) \times (N/2)$, the table generated by this method is constituted by $2 \times (n+1) \times (N/2) \times (N/2) = (n+1) \times N^2/2$ MDCT coefficients (floating-point numbers). However, since the contents of this table tend to be redundant, the memory capacity that is actually required can be considerably reduced.
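
To give a sense of scale, the coefficient count can be evaluated for concrete values. The values N=2048 and n=2 and the 4-byte single-precision storage are assumptions made only for this illustration.

```python
def table_coefficients(N, n):
    """Number of MDCT coefficients in the uncompressed table: 2 x (n+1) x (N/2) x (N/2)."""
    return 2 * (n + 1) * (N // 2) * (N // 2)

N, n = 2048, 2
count = table_coefficients(N, n)      # equals (n+1) * N^2 / 2
size_bytes = count * 4                # assuming 4-byte floating-point numbers
size_mib = size_bytes / (1024 * 1024)
```

Under these assumptions the table holds 6,291,456 coefficients, or 24 MiB, for a single window type, which is why the reduction methods that follow matter.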

Method 1: method for using the periodicity of the basis.

The periodicity of the basis can be employed as one method. According to this method, since several of the $V_{r,b,k}$ are identical, the redundant vectors are removed.

When m is an integer, the cosine wave that is $N/2 \times m$ samples ahead is represented as

$f(x + N/2 \times m) = \cos(4k\pi/(N \times n) \times (x + N/2 \times m))$
$= \cos(4k\pi/(N \times n) \times x + 4k\pi/(N \times n) \times N/2 \times m)$
$= \cos(4k\pi/(N \times n) \times x + 2\pi k \times m/n)$.

Therefore, in case a, where $(k \times m)/n$ is an integer,

$f(x + N/2 \times m) = f(x)$ (limited to a range $0 \le x \le N/2 \times (n-m)$)

$g(y + N/2 \times m) = g(y)$ (limited to a range $N/2 \le y \le N/2 \times (n-m+1)$).

Thus,

$h_{b+m}(z) = h_b(z)$ (limited to a range $2 \le b \le n-m$),

and

$V_{r,b+m,k} = V_{r,b,k}$ (limited to a range $2 \le b \le n-m$)

is obtained. The range is limited because of the range defined for f(x).

In case b, where $(k \times m)/n$ is an irreducible fraction that can be represented by integer/2,

$f(x + N/2 \times m) = -f(x)$

and

$h_{b+m}(z) = -h_b(z)$.

Thus,

$V_{r,b+m,k} = -V_{r,b,k}$.

The range limitation is the same as it is for case a.

In case c, where $(k \times m)/n$ is an irreducible fraction that can be represented by $(4 \times \mathrm{integer} + 1)/4$,

$f(x + N/2 \times m) = \cos(4k\pi/(N \times n) \times x + \pi(\text{even number} + 1/2))$
$= -\sin(4k\pi/(N \times n) \times x)$.

Thus,

$V_{r,b+m,k} = -V_{i,b,k}$.

In case d, where $(k \times m)/n$ is an irreducible fraction that can be represented by $(4 \times \mathrm{integer} + 3)/4$,

$f(x + N/2 \times m) = \cos(4k\pi/(N \times n) \times x + \pi(\text{odd number} + 1/2))$
$= \sin(4k\pi/(N \times n) \times x)$.

Thus,

$V_{r,b+m,k} = V_{i,b,k}$.

The range limitation is the same as it is for case a.

Therefore, any $V_{r,b+m,k}$ that establishes one of the conditions a to d can be replaced by another vector, and the same applies to $V_{i,b,k}$. Thus, instead of storing the matrixes $T_{r,b}$ and $T_{i,b}$ unchanged, only the following minimum elements need be stored.

* vectors $V_{r,b,k}$ and $V_{i,b,k}$ that do not establish the conditions a to d

* information concerning the positive or negative sign that is to be added to the vector that is to be used for each column in the matrixes $T_{r,b}$ and $T_{i,b}$.
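
The four conditions depend only on the fractional part of $(k \times m)/n$, so deciding which stored vector and which sign to use can be sketched as a small lookup. The function name and the returned tuple layout are hypothetical, and the sign and vector family returned for cases c and d are read off from the cosine identities given for those cases.

```python
from fractions import Fraction

# Fractional part of (k*m)/n -> (case, sign, family of the vector to reuse)
_CASES = {
    Fraction(0): ("a", +1, "r"),     # cos shifted by a whole period
    Fraction(1, 2): ("b", -1, "r"),  # cos shifted by half a period
    Fraction(1, 4): ("c", -1, "i"),  # cos becomes -sin
    Fraction(3, 4): ("d", +1, "i"),  # cos becomes +sin
}

def periodicity_case(k, m, n):
    """Return (case, sign, family) so that V_{r,b+m,k} = sign * V_{family,b,k}.

    Returns None when none of the conditions a to d holds, in which case
    the vector has to be stored explicitly.
    """
    return _CASES.get(Fraction(k * m, n) % 1)

examples = {
    (4, 1, 4): periodicity_case(4, 1, 4),
    (2, 1, 4): periodicity_case(2, 1, 4),
    (1, 1, 4): periodicity_case(1, 1, 4),
    (3, 1, 4): periodicity_case(3, 1, 4),
    (1, 1, 3): periodicity_case(1, 1, 3),
}
```

The range limitation on b stated for case a still has to be checked separately before a vector is actually reused.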

For the actual transform between the MDCT domain and the frequency domain, the vectors $V_{r,b,k}$ and $V_{i,b,k}$ are employed instead of the columns in the matrixes $T_{r,b}$ and $T_{i,b}$ to perform a calculation equivalent to the matrix operation. The transform from the frequency domain to the MDCT domain is represented as follows.

Equation 5

For a portion wherein a vector is standardized, i.e., shared with another column, the appropriate stored vector is employed in its place. The transform from the MDCT domain to the frequency domain is performed by obtaining the following inner product for each frequency component. The following equation is obtained by separating the equation used for the matrixes $T_{r,b}$ and $T_{i,b}$ into its individual components.

Equation 6

Due to the vector standardization, the required memory capacity depends on "n" to a degree. For example, since only the condition a is established when n=3, the required memory capacity is reduced by only 8.3%, while when n=4, it is reduced by 40%.

In a case where the window function is varied, the same relation exists among the $h_{b,w}$ as when only one type of window function is provided. The above standardization can therefore be employed unchanged, and when the same condition is established, the following equation is obtained.

Equation 7

Method 2: method for separating the basis into preceding and succeeding segments.

Furthermore, the linearity of the MDCT is employed to separate the basis of a Fourier transform into individual segments, and the MDCT coefficient sequences obtained by transforming the segments are used to form a table. In this way, the application range of the above method 1 can be expanded. In actual use, the sum of the vectors of the MDCT coefficient sequences that are stored in the table is employed to represent the basis. Fig. 8 is a diagram showing an example wherein a basis is separated.

First, a waveform (thick line on the left in Fig. 8) is divided into the first N/2 samples and the last N/2 samples for each block. To perform an MDCT for the first N/2 samples, N/2 samples having a value of 0 are added after them (in the middle in Fig. 8). To perform an MDCT for the last N/2 samples, N/2 samples having a value of 0 are added before them (on the right in Fig. 8). In this example, the MDCT is performed for the first (last) half of the waveform, and the obtained MDCT coefficient sequence is represented by $V_{fore,r,b,k}$ ($V_{back,r,b,k}$). Since the MDCT possesses linearity, the original MDCT coefficient sequence $V_{r,b,k}$ is equal to the sum of the vectors $V_{fore,r,b,k}$ and $V_{back,r,b,k}$.

When the basis is separated in this manner, $V_{fore,r,b,k}$ and $V_{back,r,b,k}$ can be used in common even for the portion wherein $V_{r,b,k}$ cannot be standardized using method 1. For example, in Fig. 5, method 1 cannot be applied for Block 1 because b=1. However, if each block is separated into first and last segments, the MDCT coefficient sequence $V_{back,r,1,k}$ for Block 1 and the MDCT coefficient sequence $V_{back,r,2,k}$ for Block 2 differ only in that the signs are inverted. Therefore, one of these MDCT coefficient sequences need not be stored. This can also be applied for $V_{fore,r,2,k}$ for Block 2 and $V_{fore,r,3,k}$ for Block 3. $V_{fore,r,1,k}$ for Block 1 and $V_{back,r,3,k}$ for Block 3 are always zero vectors, because those halves consist only of the zero-valued samples on either side of the frame.

The processing for generating a table using the above method is as follows.

Step 1: The same as when the basis is not separated into first and second segments.

Step 2: The same as when the basis is not separated into first and second segments.

Step 3: First, the "fore" coefficients are prepared. The $(N/2 \times (b-1))$-th to the $(N/2 \times b)$-th samples are extracted, and N/2 samples having a value of 0 are added after them.

$h_{fore,b}(z) = g(z + N/2 \times (b-1)) \quad (0 \le z < N/2)$
$h_{fore,b}(z) = 0 \quad (N/2 \le z < N)$

Step 4: The results are multiplied by a window function.

$h_{fore,b}(z) = h_{fore,b}(z) \times \mathrm{win}(z) \quad (0 \le z < N)$, where win(z) is a window function.

Step 5: The MDCT process is performed, and the obtained N/2 MDCT coefficients are defined as vector $V_{fore,r,b,k}$.

$V_{fore,r,b,k} = \mathrm{MDCT}(h_{fore,b}(z))$.

Step 6: Next, the "back" coefficients are prepared. The $(N/2 \times b)$-th to the $(N/2 \times (b+1))$-th samples are extracted, and N/2 samples having a value of 0 are added before them.

$h_{back,b}(z) = 0 \quad (0 \le z < N/2)$
$h_{back,b}(z) = g(z + N/2 \times (b-1)) \quad (N/2 \le z < N)$

Step 7: The results are multiplied by a window function.

$h_{back,b}(z) = h_{back,b}(z) \times \mathrm{win}(z) \quad (0 \le z < N)$, where win(z) is a window function.

Step 8: The MDCT process is performed, and the obtained N/2 MDCT coefficients are defined as vector $V_{back,r,b,k}$.

$V_{back,r,b,k} = \mathrm{MDCT}(h_{back,b}(z))$.

Step 9: $V_{fore,r,b,k}$ and $V_{back,r,b,k}$ are calculated for all the combinations (k, b), and the matrixes $T_{fore,r,b}$ and $T_{back,r,b}$ are formed.

$T_{fore,r,b} = (V_{fore,r,b,1}, V_{fore,r,b,2}, \ldots, V_{fore,r,b,N/2})$

$T_{back,r,b} = (V_{back,r,b,1}, V_{back,r,b,2}, \ldots, V_{back,r,b,N/2})$

In accordance with the linearity of the MDCT,

$V_{r,b,k} = V_{fore,r,b,k} + V_{back,r,b,k}$,

and

$T_{r,b} = T_{fore,r,b} + T_{back,r,b}$.

In accordance with this characteristic, for the transform between the MDCT domain and the frequency domain, only an operation equivalent to the operation performed using $T_{r,b}$ need be performed by using $T_{fore,r,b}$ and $T_{back,r,b}$.
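
The linearity that this method relies on can be checked numerically with a short sketch. The direct MDCT formula, the sine window, the test waveform and the block length N=16 are assumptions for illustration. The split is applied to the already windowed block here; since the window is a sample-by-sample multiplication, this gives the same halves as windowing after the split.

```python
import math

def mdct(block):
    """Direct O(N^2) MDCT: N input samples -> N/2 coefficients."""
    N = len(block)
    return [
        sum(block[t] * math.cos(2 * math.pi / N * (t + 0.5 + N / 4) * (j + 0.5))
            for t in range(N))
        for j in range(N // 2)
    ]

def split_fore_back(h):
    """Return (fore, back): each keeps one half of h and zero-fills the other half."""
    half = len(h) // 2
    fore = h[:half] + [0.0] * half
    back = [0.0] * half + h[half:]
    return fore, back

N = 16
# An arbitrary windowed block standing in for h_b(z)
h = [math.cos(0.3 * z) * math.sin(math.pi * (z + 0.5) / N) for z in range(N)]

fore, back = split_fore_back(h)
V = mdct(h)
V_fore = mdct(fore)
V_back = mdct(back)
max_error = max(abs(V[j] - (V_fore[j] + V_back[j])) for j in range(N // 2))
```

`max_error` stays at rounding level, confirming that the two half-block sequences can stand in for the original column.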

The periodicity of the basis is employed under these definitions. In case a, where $(k \times m)/n$ is an integer, and under the condition where b+m=n+1,

$h_{fore,n+1}(z) = h_{fore,b}(z)$

is established. This is because the second half of $h_{fore,b}(z)$ has a value of 0. Thus, the application range for the following equation is expanded, and

$h_{fore,b+m}(z) = h_{fore,b}(z)$ (limited to a range of $2 \le b \le n-m+1$).

Thus,

$V_{fore,r,b+m,k} = V_{fore,r,b,k}$ (limited to a range of $2 \le b \le n-m+1$),

and the portions used in common are increased. For $V_{back,r,b,k}$,

$h_{back,m+1}(z) = h_{back,1}(z)$

is established even under the condition where b=1. This is because the first half of $h_{back,1}(z)$ has a value of zero. The application range for the following equation is expanded, and

$h_{back,b+m}(z) = h_{back,b}(z)$ (limited to a range of $1 \le b \le n-m$).

Therefore,

$V_{back,r,b+m,k} = V_{back,r,b,k}$ (limited to a range of $1 \le b \le n-m$),

and the portions used in common are increased. The same range limitation is provided for the cases b, c and d.

Method 3: approximating method

The final method for reducing the table involves the use of an approximation. Among the MDCT coefficients that correspond to one basis waveform of a Fourier transform, an MDCT coefficient whose magnitude is smaller than a specific value can be approximated as zero without any practical problem occurring. A threshold value used for the approximation is appropriately selected by a trade-off between the transform precision and the memory capacity. When the individual systems are so designed that they do not perform the matrix calculation for the portions that are approximated as zero, the calculation time can also be reduced.

Furthermore, when all the coefficients, including large coefficients, are approximated by rational numbers and then quantized, the coefficients can be stored as integers, not as floating-point numbers, so that a savings in memory capacity can be realized.
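
Both ideas, dropping small coefficients and storing the rest as integers, can be sketched as follows. The function names, the threshold, the fixed-point scale of 2^15 and the sample column are all assumptions for illustration; the text does not prescribe a storage format.

```python
def sparsify(column, threshold):
    """Keep only (index, value) pairs whose magnitude reaches the threshold."""
    return [(j, c) for j, c in enumerate(column) if abs(c) >= threshold]

def quantize(sparse, scale):
    """Store each kept value as the integer nearest to value * scale."""
    return [(j, round(c * scale)) for j, c in sparse]

def sparse_dot(sparse_q, scale, vector):
    """Inner product with a dense vector, skipping the entries treated as zero."""
    return sum(q * vector[j] for j, q in sparse_q) / scale

column = [0.9, -0.004, 0.25, 0.0007]    # one hypothetical table column
scale = 1 << 15

kept = sparsify(column, threshold=0.01)
stored = quantize(kept, scale)
approx = sparse_dot(stored, scale, [1.0, 1.0, 1.0, 1.0])
exact = sum(column)
```

Only two of the four entries are stored, and the inner product is evaluated over those two alone, which is where the saving in calculation time comes from. The difference between `approx` and `exact` is the price paid for the chosen threshold and scale.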

Correlation table generator

Information concerning the window is received, and the table is generated and output. As well as the method for generating the correlation table, the information concerning the window includes the frame length N, the number n of blocks corresponding to the frame, the offset of the first window, the window function, and "W" for regulating the window length. Basically, the number of tables that are generated is equivalent to the number of window types used in the target sound compression technique.

Additional information embedding system

Fig. 9 is a block diagram illustrating an additional information embedding system according to the present invention. An MDCT coefficient recovery unit 210 recovers the MDCT coefficient sequences of the sound, together with window information and other information, from the compressed audio data that are entered. These data are extracted (recovered) using the Huffman decoding, inverse quantization and prediction method that are designated in the compressed audio data. An MDCT/DFT transformer 230 receives the MDCT coefficient sequences and the window information that are obtained by the MDCT coefficient recovery unit 210, and employs a table 900 to transform these data into frequency components. A frequency domain embedding unit 250 embeds additional information in the frequency components that are obtained by the MDCT/DFT transformer 230.

In accordance with the window information extracted by the MDCT coefficient recovery unit 210, a DFT/MDCT transformer 240 employs the table 900 to transform the frequency components obtained by the frequency domain embedding unit 250 into MDCT coefficient sequences. Finally, an MDCT coefficient compressor 220 compresses the MDCT coefficients obtained by the DFT/MDCT transformer 240, together with the window information and the other information that are extracted by the MDCT coefficient recovery unit 210. The compressed audio data are thus obtained. The prediction method, the quantization and the Huffman coding, as designated in the window information and the other information, are employed for the data compression. Through this processing, the additional information is embedded in a manner equivalent to an operation on the frequency components, so that even after decompression the additional information can be detected using the conventional frequency domain detection method.
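
The data flow of Fig. 9 can be summarized as a chain of five stages. In the sketch below each unit is passed in as a callable; the function name, the parameter names and the toy stand-ins in the usage example are hypothetical and only show the order in which the units hand data to one another.

```python
def embed_in_compressed(compressed, recover, to_frequency, embed, to_mdct, compress):
    """Run units 210 -> 230 -> 250 -> 240 -> 220 on compressed audio data."""
    mdct_seq, window_info, other_info = recover(compressed)   # unit 210
    freq = to_frequency(mdct_seq, window_info)                # unit 230, uses table 900
    marked = embed(freq)                                      # unit 250
    new_seq = to_mdct(marked, window_info)                    # unit 240, uses table 900
    return compress(new_seq, window_info, other_info)         # unit 220

# Toy stand-ins, only to show the data flow
result = embed_in_compressed(
    [1.0, 2.0],
    recover=lambda data: (list(data), "long-window", {}),
    to_frequency=lambda seq, win: [2 * c for c in seq],
    embed=lambda freq: [c + 0.5 for c in freq],
    to_mdct=lambda freq, win: [c / 2 for c in freq],
    compress=lambda seq, win, other: (seq, win),
)
```

The window information recovered by the first stage is reused by both transformers and by the compressor, so that the output keeps the block structure of the input.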

Additional information detection system

Fig. 10 is a block diagram illustrating an additional information detection system according to the present invention. An MDCT coefficient recovery unit 210 recovers the MDCT coefficient sequences of the sound, window information and other information from the compressed audio data that are entered. These data are extracted (recovered) using the Huffman decoding, inverse quantization and prediction method that are designated in the compressed audio data. An MDCT/DFT transformer 230 receives the MDCT coefficient sequences and the window information that are obtained by the MDCT coefficient recovery unit 210, and employs a table 900 to transform these data into frequency components. Finally, a frequency domain detector 310 detects additional information in the frequency components that are obtained by the MDCT/DFT transformer 230, and outputs the additional information.

Additional information updating system

Fig. 11 is a block diagram illustrating an additional information updating system according to the present invention.

An MDCT coefficient recovery unit 210 recovers the MDCT coefficient sequences of the sound, window information and other information from the compressed audio data that are entered. These data are extracted (recovered) using the Huffman decoding, inverse quantization and prediction method that are designated in the compressed audio data.

An MDCT/DFT transformer 230 receives the MDCT coefficient sequences and the window information that are obtained by the MDCT coefficient recovery unit 210, and employs a table 900 to transform these data into frequency components.

A frequency domain updating unit 410 first determines whether additional information is embedded in the frequency components obtained by the MDCT/DFT transformer 230. If additional information is embedded therein, the frequency domain updating unit 410 further determines whether the contents of the additional information should be changed. Only when the contents of the additional information should be changed is the updating of the additional information performed for the frequency components (the determination results may be output so that a user of the updating unit 410 can understand them).

In accordance with the window information extracted by the MDCT coefficient recovery unit 210, a DFT/MDCT transformer 240 employs the table 900 to transform the frequency components that have been updated by the frequency domain updating unit 410 into MDCT coefficient sequences.

Finally, an MDCT coefficient compressor 220 compresses the MDCT coefficient sequences obtained by the DFT/MDCT transformer 240, together with the window information and the other information that are extracted by the MDCT coefficient recovery unit 210. The compressed audio data are thus obtained. The prediction method, the quantization and the Huffman coding, as designated in the window information and the other information, are employed for the data compression.
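
The two-stage decision made by the frequency domain updating unit 410 can be sketched as follows. The function name, the callables and the returned flag are hypothetical; how detection and rewriting are actually carried out depends on the frequency domain watermarking method in use.

```python
def update_frequency(freq, detect, should_change, rewrite):
    """Return (frequency components, updated flag) following the unit 410 logic.

    detect(freq) returns the embedded information, or None when nothing is found.
    """
    info = detect(freq)
    if info is None:                 # nothing embedded: leave the data untouched
        return freq, False
    if not should_change(info):      # embedded, but the contents stay as they are
        return freq, False
    return rewrite(freq, info), True

# Toy stand-ins, only to show the control flow
freq = [1.0, 2.0]
updated, changed = update_frequency(
    freq,
    detect=lambda f: "old",
    should_change=lambda info: info != "new",
    rewrite=lambda f, info: [c + 1 for c in f],
)
untouched, flag = update_frequency(
    freq,
    detect=lambda f: None,
    should_change=lambda info: True,
    rewrite=lambda f, info: [c + 1 for c in f],
)
```

The returned flag corresponds to the determination result that may be output to the user of the updating unit.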

General hardware arrangement

The apparatus and the systems according to the present invention can be carried out by using the hardware of a common computer. Fig. 12 is a diagram illustrating the hardware arrangement for a general personal computer. A system 100 comprises a central processing unit (CPU) 1 and a main memory 4. The CPU 1 and the main memory 4 communicate, via a bus 2 and an IDE controller 25, with a hard disk drive (HDD) 13, which is an auxiliary storage device (or a storage medium drive, such as a CD-ROM 26 or a DVD 32). Similarly, the CPU 1 and the main memory 4 communicate, via the bus 2 and a SCSI controller 27, with a hard disk drive 30, which is an auxiliary storage device (or a storage medium drive, such as an MO 28, a CD-ROM 29 or a DVD 31). A floppy disk drive (FDD) 20 (or an MO or a CD-ROM drive) is connected to the bus 2 via a floppy disk controller (FDC) 19.

A floppy disk is inserted into the floppy disk drive 20. Stored on the floppy disk and the hard disk drive 13 (or the CD-ROM 26 or the DVD 32) are a computer program, a web browser, the code for an operating system and other data supplied in order that instructions can be issued to the CPU 1, in cooperation with the operating system, to implement the present invention. These programs, code and data are loaded into the main memory 4 for execution. The computer program code can be compressed, or it can be divided into a plurality of codes and recorded using a plurality of media. The programs can also be stored on another storage medium, such as a disk, and the disk can be driven by another computer.

The system 100 further includes user interface hardware. User interface hardware components are, for example, a pointing device (a mouse, a joy stick, etc.) 7 or a keyboard 6 for inputting data, and a display (CRT) 12. A printer, via a parallel port 16, and a modem, via a serial port 15, can be connected to the system 100, so that it can communicate with another computer via the serial port 15 and the modem, or via a communication adaptor 18 (an Ethernet or a token ring card). A remote transceiver may be connected to the serial port 15 or the parallel port 16 to exchange data using infrared rays or radio.

A loudspeaker 23 receives, through an amplifier 22, sounds and tone signals that are obtained through D/A (digital-analog) conversion performed by an audio controller 21, and releases them as sound or speech. The audio controller 21 performs A/D (analog-digital) conversion for sound information received via a microphone 24, and transmits the external sound information to the system. The sound may be input at the microphone 24, and the compressed data produced by this invention may be generated based on the sound that is input.

It would therefore be easily understood that the present invention can be provided by employing an ordinary personal computer (PC), a work station, a notebook PC, a palmtop PC, a network computer, various types of electric home appliances, such as a computer-incorporating television, a game machine that includes a communication function, a telephone, a facsimile machine, a portable telephone, a PHS, a PDA, another communication terminal, or a combination of these apparatuses. The above described components, however, are merely examples, and not all of them are required for the present invention.

Advantages of the Invention

According to the present invention, provided are a method and a system for embedding, detecting or updating additional information embedded in compressed audio data, without having to decompress the audio data. Further, according to the method of the invention, the additional information embedded in the compressed audio data can be detected using a conventional watermarking technique, even when the audio data have been decompressed.

The present invention can be realized in hardware, software, or a combination of hardware and software. The present invention can be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system - or other apparatus adapted for carrying out the methods described herein - is suitable. A typical combination of hardware and software could be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein. The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which - when loaded in a computer system - is able to carry out these methods.

Computer program means or computer program in the present context mean any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after conversion to another language, code or notation and/or reproduction in a different material form.

It is noted that the foregoing has outlined some of the more pertinent objects and embodiments of the present invention. This invention may be used for many applications. Thus, although the description is made for particular arrangements and methods, the intent and concept of the invention is suitable and applicable to other arrangements and applications. It will be clear to those skilled in the art that other modifications to the disclosed embodiments can be effected without departing from the spirit and scope of the invention. The described embodiments ought to be construed to be merely illustrative of some of the more prominent features and applications of the invention. Other beneficial results can be realized by applying the disclosed invention in a different manner or modifying the invention in ways known to those familiar with the art.